Prevention, diagnosis and treatment of high-throughput sequencing data pathologies.
Authors
Abstract
High-throughput sequencing (HTS) technologies generate millions of sequence reads from DNA/RNA molecules rapidly and cost-effectively, enabling single-investigator laboratories to address a variety of 'omics' questions in nonmodel organisms and fundamentally changing the way genomic approaches are used to advance biological research. One major challenge posed by HTS is the complexity and difficulty of data quality control (QC). While QC issues associated with sample isolation, library preparation and sequencing are well known and protocols for their handling are widely available, the QC of the actual sequence reads generated by HTS is often overlooked. HTS-generated sequence reads can contain various errors, biases and artefacts whose identification and amelioration can greatly impact subsequent data analysis. However, a systematic survey of QC procedures for HTS data is still lacking. In this review, we begin by presenting standard 'health check-up' QC procedures recommended for HTS data sets and establishing what 'healthy' HTS data look like. We then classify the errors, biases and artefacts present in HTS data into three major types of 'pathologies', discussing their causes and symptoms and illustrating with examples their diagnosis and impact on downstream analyses. We conclude this review by offering examples of successful 'treatment' protocols and recommendations on standard practices and treatment options. Notwithstanding the speed with which HTS technologies - and consequently their pathologies - change, we argue that careful QC of HTS data is an important - yet often neglected - aspect of their application in molecular ecology, and lay the groundwork for developing an HTS data QC 'best practices' guide.
Similar resources
Strategies and Clinical Applications of Next Generation Sequencing
Abstract DNA sequencing is one of the great valuable techniques in molecular biology, which can be used to detect the sequence of nucleotides in a DNA fragment. The high-throughput sequencing known as Next Generation Sequencing (NGS) revolutionized genomic research and molecular biology; therefore, the whole human genome can be sequenced with a low cost in several days. NGS technology is simi...
Seasonal variations of microbial community in a full scale oil field produced water treatment plant
This study investigated the microbial community in a full scale anaerobic baffled reactor and sequencing batch reactor system for oil-produced water treatment in summer and winter. The community structures of fungi and bacteria were analyzed through polymerase chain reaction–denaturing gradient gel electrophoresis and Illumina high-throughput sequencing, respectively. Chemical oxygen demand eff...
A review of DNA sequencing techniques (first, second and third generation)
Introduction: The DNA sequencing is the most important technique in molecular biology by which the order of the nucleotides can be identified in a piece of DNA. There are several different methods for sequencing the DNA. Now, the DNA sequencing has great importance in the medical diagnostics and other medical fields. Some methods have been invented to speed up and increase the efficiency of the...
The significance of gene profiling in diagnosis the cause of drug resistance in cancer
Chemoresistance is one of the main obstacles to the success of cancer treatment and one of the most important causes of death in patients. In the last decade, progress in high-throughput technologies, including microarray, sequencing, and bioinformatics has greatly resulted in cancer gene profiling and identification of biomarkers for cancer prognosis and prediction. This has greatly improved t...
Journal title:
- Molecular ecology
Volume 23, Issue 7
Pages -
Publication date: 2014